SINUSOIDAL AUDIO CODING WITH PHASE UPDATES



FIELD OF THE INVENTION 

The present invention relates to coding and decoding audio signals. 

BACKGROUND OF THE INVENTION 
A parametric coding scheme, in particular a sinusoidal coder, is described in
PCT Patent Application No. WO 00/79519-A1 (Attorney Ref. PHN 017502) and PCT Patent
Application No. IB/02/01297, filed 18.04.2001 (Attorney Ref. PHNL010252). In this coder,
an audio segment or frame is modeled using a number of sinusoids represented by amplitude,
frequency and phase parameters. Once the sinusoids for a segment are estimated, a tracking
algorithm is initiated. This algorithm tries to link sinusoids with each other on a
segment-to-segment basis. Sinusoidal parameters from appropriate sinusoids from
consecutive segments are thus linked to obtain so-called tracks. The linking criterion is
based on the frequencies of two subsequent segments, but amplitude and/or phase
information can also be used. This information is combined in a cost function that
determines the sinusoids to be linked. The tracking algorithm thus results in sinusoidal
tracks that start at a specific time instance, evolve for a certain amount of time over a
plurality of time segments and then stop.
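The linking step can be pictured with a minimal Python sketch. The cost function, its weights (`w_freq`, `w_amp`), the threshold `max_cost` and the greedy matching are assumptions chosen for illustration only; the cited applications define the criterion actually used.

```python
import math

def link_cost(f_prev, a_prev, f_cur, a_cur, w_freq=1.0, w_amp=0.1):
    """Hypothetical cost: weighted log-frequency and log-amplitude distance."""
    return (w_freq * abs(math.log(f_cur / f_prev))
            + w_amp * abs(math.log(a_cur / a_prev)))

def link_segments(prev, cur, max_cost=0.05):
    """Greedily link sinusoids (freq, amp) of two consecutive segments.

    Returns sorted (index_in_prev, index_in_cur) pairs. Unlinked entries of
    `prev` end a track (death); unlinked entries of `cur` start one (birth).
    """
    candidates = sorted(
        (link_cost(fp, ap, fc, ac), i, j)
        for i, (fp, ap) in enumerate(prev)
        for j, (fc, ac) in enumerate(cur)
    )
    used_prev, used_cur, links = set(), set(), []
    for cost, i, j in candidates:
        if cost > max_cost:
            break  # all remaining candidates are even more expensive
        if i not in used_prev and j not in used_cur:
            links.append((i, j))
            used_prev.add(i)
            used_cur.add(j)
    return sorted(links)
```

With `prev = [(440.0, 1.0), (1000.0, 0.5)]` and `cur = [(1005.0, 0.5), (442.0, 0.9), (3000.0, 0.2)]`, the two nearby pairs are linked and the 3000 Hz component is left as a birth.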

In practical implementations of such prior art coders, only the initial phase of a
sinusoidal track is transmitted by the coder. In the decoder, the continuous phase of a
sinusoid in a sinusoidal track is calculated from the phase of the originating sinusoid and
the frequencies of the intermediate sinusoids. So, for example, the continuous phase
($\tilde{\phi}_k$) of sinusoid k in the track can be calculated as:

$$\tilde{\phi}_k = \operatorname{mod}_{2\pi}\!\left(\tilde{\phi}_{k-1} + \frac{L}{2}\left(\hat{f}_k + \hat{f}_{k-1}\right)\right) \qquad \text{Equation 1}$$

where L is the update interval of the frequencies (in sec), typically in the order of 10 ms,
and $\hat{f}_k$ and $\hat{f}_{k-1}$ are the quantized frequencies (in rad/s) of frame k and
k-1, respectively. The function $\operatorname{mod}_{2\pi}$ represents the modulo operation
which maps onto the interval between $-\pi$ and $\pi$. Furthermore, the initial phase (k=1)
is $\tilde{\phi}_1 = \hat{\phi}_1$, where $\hat{\phi}_1$ is the measured and quantized
phase of the originating sinusoid in a track. Other phase continuation functions are also
possible, as indicated in European Patent Application No. 01204062.2 filed on 26 October
2001 (Attorney Docket No. PHNL010787), where a warp factor can be determined by the
coder and used in linking tracks as well as in the decoder in the calculation of continuous
phases.
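The recursion of Equation 1 can be sketched in Python as follows. The function names are hypothetical, and the wrap is taken onto the half-open interval [-pi, pi), which is one possible reading of "between -pi and pi".

```python
import math

def wrap_phase(phi):
    """Map a phase in radians onto the interval [-pi, pi)."""
    return (phi + math.pi) % (2.0 * math.pi) - math.pi

def continuous_phases(start_phase, freqs, update_interval):
    """Continue the phase along a track as in Equation 1.

    start_phase      quantized phase of the originating sinusoid (rad)
    freqs            quantized frequency of each frame in the track (rad/s)
    update_interval  L, in seconds
    """
    phases = [wrap_phase(start_phase)]
    for k in range(1, len(freqs)):
        # Average of the two frame frequencies times the update interval.
        step = 0.5 * update_interval * (freqs[k] + freqs[k - 1])
        phases.append(wrap_phase(phases[-1] + step))
    return phases
```

Only `start_phase` has to be transmitted; every later phase follows from the frequencies, which is why no correction is possible once the reconstruction starts to drift.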

Nonetheless, especially for long tracks, it is likely that the continuous phase
will diverge from the measured phase to the extent that they no longer resemble one
another. This divergence can be introduced by inaccuracies in the estimation of the
frequencies, by the quantization of the frequencies and the initial phase, or by the linear
continuation of the phase. For an individual sinusoidal track, this divergence might not be
audible. However, in natural audio, the phase relation between sinusoidal tracks can be
important. As such, the loss of phase synchronization between tracks can introduce
artefacts such as a double-speaker effect, a metallic sound etc.
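How quickly such a divergence builds up can be illustrated with a small calculation. The constant frequency error of 0.5 Hz is an assumed figure for illustration and is not taken from the measurements in this document; the 10 ms interval is the typical value mentioned above.

```python
import math

def accumulated_phase_error(freq_error_hz, update_interval, n_frames):
    """Phase error (rad) built up by a constant frequency error over n frames."""
    return 2.0 * math.pi * freq_error_hz * update_interval * n_frames

for frames in (4, 25, 100):
    err = accumulated_phase_error(0.5, 0.010, frames)
    print(frames, "frames:", round(math.degrees(err), 1), "degrees")
```

Under these assumptions the error grows by about 1.8 degrees per frame, so a track lasting one second (100 frames) ends up roughly half a cycle away from the measured phase.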

The loss of phase synchronization between tracks is illustrated quantitatively
in Figure 4. In this figure, the top trace shows a part of a waveform generated by a German
male speaker. The middle trace shows the waveform of a corresponding sinusoidal signal
generated using a prior art encoder/decoder and the bottom trace shows the difference
between the original and the sinusoidal signal. As can be seen from the error signal, the
sinusoidal signal does not match the original signal.

The present invention attempts to mitigate this problem.

DISCLOSURE OF THE INVENTION 

According to the present invention there is provided a method according to
claim 1.



In the prior art, especially in the case of long tracks decoded with only
continuous phase information, the divergence between the continuous and originally
measured phase will be large. The phase update method according to the present invention
largely removes artefacts introduced by tracks encoded and decoded with a continuous phase.



BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 shows an embodiment of an audio coder according to the invention;

Figure 2 shows an embodiment of an audio player according to the invention;

Figure 3 shows a system comprising an audio coder and an audio player
according to the invention;

Figure 4 shows an original waveform (top trace) compared to a sinusoidal signal
with continuous phase (middle trace) generated with a prior art encoder/decoder and the error
signal (bottom trace);

Figure 5 shows an original waveform (top trace) compared to a sinusoidal signal
with phase update (middle trace) generated with an encoder/decoder according to a preferred
embodiment of the present invention and the error signal (bottom trace); and

Figure 6 shows the distribution of phase difference ($\Delta$) for a German male
speaker excerpt.

DESCRIPTION OF THE PREFERRED EMBODIMENT 

In a preferred embodiment of the present invention (Figure 1), the encoder is a
sinusoidal coder of the type described in WO 01/69593-A1 (Attorney Ref. PH-NL000120).
The operation of this coder and its corresponding decoder has been well described, and
description is only provided here where relevant to the present invention.

In both the earlier case and the preferred embodiment, the audio coder 1
samples an input audio signal at a certain sampling frequency resulting in a digital
representation x(t) of the audio signal. The coder 1 then separates the sampled input signal
into three components: transient signal components, sustained deterministic components, and
sustained stochastic components. The audio coder 1 comprises a transient coder 11, a
sinusoidal coder 13 and a noise coder 14. The audio coder optionally comprises a gain
compression mechanism (GC) 12.

The transient coder 11 comprises a transient detector (TD) 110, a transient
analyzer (TA) 111 and a transient synthesizer (TS) 112. First, the signal x(t) enters the
transient detector 110. This detector 110 estimates if there is a transient signal component
and its position. This information is fed to the transient analyzer 111. If the position of a
transient signal component is determined, the transient analyzer 111 tries to extract (the main
part of) the transient signal component. It matches a shape function to a signal segment
preferably starting at an estimated start position, and determines content underneath the shape
function, by employing for example a (small) number of sinusoidal components. This
information is contained in the transient code CT and more detailed information on
generating the transient code CT is provided in WO 01/69593-A1.

The transient code CT is furnished to the transient synthesizer 112. The
synthesized transient signal component is subtracted from the input signal x(t) in subtractor
16, resulting in a signal x1. In case the GC 12 is omitted, x1 = x2.

The signal x2 is furnished to the sinusoidal coder 13 where it is analyzed in a
sinusoidal analyzer (SA) 130, which determines the (deterministic) sinusoidal components. It
will therefore be seen that while the presence of the transient analyzer is desirable, it is not
necessary and the invention can be implemented without such an analyzer. In any case, the
end result of sinusoidal coding is a sinusoidal code CS and a more detailed example
illustrating the conventional generation of an exemplary sinusoidal code CS is provided in
PCT patent application No. WO 00/79519-A1 (Attorney Ref: PHN 017502).

In brief, however, such a sinusoidal coder encodes the input signal x2 as tracks
of sinusoidal components linked from one frame segment to the next. From the sinusoidal
code CS generated with the sinusoidal coder, the sinusoidal signal component is
reconstructed by a sinusoidal synthesizer (SS) 131. This signal is subtracted in subtractor 17
from the input x2 to the sinusoidal coder 13, resulting in a remaining signal x3 devoid of
(large) transient signal components and (main) deterministic sinusoidal components.

Tracks are initially represented by a start frequency, a start amplitude and a 
start phase for a sinusoid beginning in a given segment - a birth. As disclosed in European 
Patent Application No. 02077727.2 filed 8 July 2002 (Attorney Docket No. PHNL020598), a 
start phase may be dropped for very short tracks. In such cases, the decoder uses a random 
start phase when synthesizing the starting segments of short tracks. 

In any case, after a birth, the track is represented in subsequent segments by
frequency differences and amplitude differences (continuations) until the segment in which
the track ends (death). In practical implementations of prior art encoders, for long or short
tracks, phase information is not encoded for continuations at all and phase information is
regenerated using continuous phase reconstruction. This is done because transmission of
phase information significantly increases the bit rate.
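The birth/continuation/death representation can be sketched as a small data structure. The `Track` container and the use of plain linear differences are assumptions for illustration; an actual coder would code quantized differences in whatever domain its quantizer uses.

```python
from dataclasses import dataclass, field

@dataclass
class Track:
    """Hypothetical container: a birth followed by differential continuations."""
    start_freq: float
    start_amp: float
    start_phase: float
    continuations: list = field(default_factory=list)  # (d_freq, d_amp) pairs

def decode_track(track):
    """Rebuild absolute (freq, amp) per segment from the birth plus differences."""
    freq, amp = track.start_freq, track.start_amp
    frames = [(freq, amp)]
    for d_freq, d_amp in track.continuations:
        freq += d_freq
        amp += d_amp
        frames.append((freq, amp))
    return frames  # the track dies after the last continuation
```

Note that `start_phase` is the only phase value stored: in the prior art arrangement described above, all later phases must be regenerated from the decoded frequencies.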

According to the present invention, in order to limit divergence between the
phase ($\phi_k$) measured by the sinusoidal analyzer 130 when analyzing a signal and the
continuous phase ($\tilde{\phi}_k$) generated by both the encoder synthesizer 131 and the
corresponding decoder synthesizer 32 when synthesizing the signal, the sinusoidal analyzer
130 generates a phase update for every n-th frame in a track. In the preferred embodiment,
n is 4. (If a track is shorter than n frames, no phase update is applied and only the first
phase may be transmitted.) Thus, in the synthesizers 131, 32, the phase can only diverge
within these n frames, after which the phase is restored again.

In a first embodiment, during the life of a track, the analyzer 130 periodically
quantizes the measured phase ($\phi_k$) and includes this value in the sinusoidal code (CS)
transmitted to the decoder. Typically, the phase can be accurately and uniformly quantized
using 5 bits. It is acknowledged that the phase update requires additional information to be
transmitted to the decoder. For a typical set of test signals (audio and speech), the bit rate
with phase update for n = 4 will increase, depending on the excerpt, by 1-3 kbit/s for a 24
kbit/s sinusoidal coder.
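A uniform 5-bit phase quantizer of the kind mentioned above might look like the following sketch. The index mapping (level 0 at -pi, wrap-around at +pi) is an assumption; the source only states that the quantization is uniform and uses 5 bits.

```python
import math

def quantize_phase(phi, bits=5):
    """Uniformly quantize a phase in [-pi, pi) to an integer index."""
    levels = 1 << bits
    step = 2.0 * math.pi / levels
    # The modulo folds the level at +pi back onto the level at -pi.
    return int(round((phi + math.pi) / step)) % levels

def dequantize_phase(index, bits=5):
    """Map a quantizer index back to a phase in [-pi, pi)."""
    step = 2.0 * math.pi / (1 << bits)
    return index * step - math.pi
```

With 32 levels the step is 2*pi/32, so the reconstruction error is at most pi/32 rad (about 5.6 degrees).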

It will be seen that there are several ways to transmit the phase update to the
decoder. In the first embodiment, the measured phase is quantized in the same manner as is
used to determine the phase of the first sinusoid in a track. For the sinusoid where the phase
update occurs, i.e. every n frames, this quantized phase ($\hat{\phi}_k$) is transmitted.

A second method to transmit the phase update to the decoder is to quantize
phase differences for each update point. Thus, the difference between the measured phase
and the continuous phase, denoted by $\Delta_k$, is computed by:

$$\Delta_k = \operatorname{mod}_{2\pi}\!\left(\hat{\phi}_k - \tilde{\phi}_k\right) \qquad \text{Equation 2}$$

where $\tilde{\phi}_k$ is defined by Equation 1, k is the frame number in the track and
$\hat{\phi}_k$ represents the quantized phase. For example, the difference $\Delta_k$ is
calculated when k-1 is a multiple of n. For n=4, this means that a phase update happens for
frame 1, 5, 9, etc. where phase difference $\Delta_k$ is transmitted to the decoder.
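The update schedule and the phase difference of Equation 2 can be sketched as follows. The helper names are hypothetical, and frames are numbered from 1 within a track, as in the text.

```python
import math

def wrap_phase(phi):
    """Map a phase in radians onto the interval [-pi, pi)."""
    return (phi + math.pi) % (2.0 * math.pi) - math.pi

def is_update_frame(k, n=4):
    """True for frames 1, n+1, 2n+1, ... (k is 1-based within the track)."""
    return (k - 1) % n == 0

def phase_difference(quantized_phase, continuous_phase):
    """Equation 2: wrapped difference between quantized and continuous phase."""
    return wrap_phase(quantized_phase - continuous_phase)
```

The wrap matters near the interval edges: a quantized phase of 3.0 rad against a continuous phase of -3.0 rad is a small difference of about -0.28 rad, not 6.0 rad.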

In Figure 6, the distribution of $\Delta$ of the second embodiment for a German male
speaker is shown. Due to the peaked distribution around a small range of $\Delta$ values, a
non-uniform quantization (entropy coding) can be applied such that fewer than 5 bits per
update can be used to provide the same accuracy as the first embodiment. Furthermore,
quantization methods similar to those used in Adaptive Differential Pulse Code Modulation
(ADPCM) can be used. Instead of coding an absolute measurement at each sample point,
ADPCM codes the difference between samples and can dynamically switch the coding scale to
compensate for variations in amplitude and frequency. Thus, in the present case, adaptive
predictors (based on phase continuation) can be used to vary the phase or phase difference
quantization scale. The update rate of the phase, indicated by n, can also be made frequency
dependent. For high frequencies, a higher phase update rate (smaller n) can be used than for
the lower frequencies (higher n).
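Why a peaked distribution allows fewer bits per update can be seen from the empirical entropy of the quantizer indices. The sketch below uses made-up index sequences, not the data behind Figure 6.

```python
import math
from collections import Counter

def entropy_bits(indices):
    """Empirical entropy (bits per symbol) of a sequence of quantizer indices."""
    counts = Counter(indices)
    total = len(indices)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

uniform = list(range(32))            # every 5-bit level equally likely
peaked = [16] * 8 + [15] * 4 + [17] * 4  # illustrative: mass near zero difference
print(entropy_bits(uniform), entropy_bits(peaked))
```

For the uniform sequence the entropy equals the full 5 bits, while the illustrative peaked sequence needs only 1.5 bits per symbol on average with an ideal entropy coder.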

In any case, the signal x3 remaining after sinusoidal analysis, including taking
into account phase updates, is assumed to mainly comprise noise and the noise analyzer 14 of
the preferred embodiment produces a noise code CN representative of this noise, as described
in, for example, PCT patent application WO 01/89086-A1 (Attorney Ref: PHNL000287).
Again, it will be seen that the use of such an analyzer is not essential to the implementation
of the present invention, but is nonetheless complementary to such use.

Finally, in a multiplexer 15, an audio stream AS is constituted which includes
the codes CT, CS and CN. The audio stream AS is furnished to e.g. a data bus, an antenna
system, a storage medium etc.

Fig. 2 shows an audio player 3 according to the invention. An audio stream
AS', e.g. generated by an encoder according to Fig. 1, is obtained from the data bus, antenna
system, storage medium etc. The audio stream AS' is de-multiplexed in a de-multiplexer 30 to
obtain the codes CT, CS and CN. These codes are furnished to a transient synthesizer 31, a
sinusoidal synthesizer 32 and a noise synthesizer 33 respectively. From the transient code
CT, the transient signal components are calculated in the transient synthesizer 31. In case the
transient code indicates a shape function, the shape is calculated based on the received
parameters. Further, the shape content is calculated based on the frequencies and amplitudes
of the sinusoidal components. If the transient code CT indicates a step, then no transient is
calculated. The total transient signal yT is a sum of all transients.

The sinusoidal code CS is used to generate signal yS, described as a sum of 
sinusoids on a given segment. In prior art decoders, in order to decode the frequencies, the 
continuous phase of a sinusoid in a sinusoidal track is calculated from only the phase of the 
originating sinusoid and the frequencies of the intermediate sinusoids. 

In the decoder of the preferred embodiment, either the transmitted quantized
phase is used to compute the phase difference $\Delta_k$ or the phase difference $\Delta_k$
is derived directly from the bitstream.

The synthesizers 131, 32 of the preferred embodiments also take into account
the possibility of "phase jumps". A phase jump occurs if the difference between two
consecutive phases within a track is large. This can lead to artefacts such as a click.
Therefore, in the preferred embodiment, the synthesizers 131, 32 spread the difference
between the measured and the continuous phase over the n frames and so, in this case, only a
small phase correction per sinusoid is made, such that large phase jumps are avoided.

Thus, $\Delta_K$ is spread over the current frame and the n-1 preceding
frames. This can for example be done in a linear fashion:

$$\Delta'_k = \frac{\Delta_K}{n} \qquad \text{Equation 3}$$

where $K-n < k \le K$, and K is the number of the frame in the track where the phase update
happens. Other methods are also possible. For example:

$$\Delta'_k = \frac{(k - K + n)\,\Delta_K}{(n+1)\,n/2} \qquad \text{Equation 4}$$

where $K-n < k \le K$. In this case, more phase correction is applied to sinusoids closer to
the phase update point.
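Both ways of spreading the phase difference can be sketched as per-frame corrections whose sum equals the transmitted difference. The function names are hypothetical; list entry i corresponds to frame k = K - n + 1 + i.

```python
def linear_spread(delta, n):
    """Equation 3: equal correction for each of the n frames up to the update."""
    return [delta / n] * n

def weighted_spread(delta, n):
    """Equation 4: correction grows towards the update frame K.

    Entry i corresponds to frame k = K - n + 1 + i, so k - K + n equals i + 1.
    """
    norm = (n + 1) * n / 2.0  # 1 + 2 + ... + n, so the weights sum to one
    return [(i + 1) * delta / norm for i in range(n)]
```

For n = 4 and a difference of 1.0 rad, the linear variant applies 0.25 rad per frame, while the weighted variant applies approximately 0.1, 0.2, 0.3 and 0.4 rad, with the largest share at the update frame.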

Thus, when synthesizing the sinusoidal components of a signal according to
the preferred embodiments of the invention, the continuous phase is calculated by taking into
account the interpolated phase differences $\Delta'_k$ from either Equation 3 or 4 that are
needed to update the phase:

$$\tilde{\phi}_k = \operatorname{mod}_{2\pi}\!\left(\tilde{\phi}_{k-1} + \frac{L}{2}\left(\hat{f}_k + \hat{f}_{k-1}\right) + \Delta'_k\right) \qquad \text{Equation 5}$$



By updating the phase on a regular basis and interpolating the phase difference
over the sinusoids in the track, the match between the original signal and the sinusoidal signal
with phase update (here n = 4) is improved. This is shown in Figure 5 where it can be seen
that the error signal (bottom trace) between the original signal (top trace) and the sinusoidal
signal (middle trace) is much reduced compared to Figure 4.
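The phase recursion with interpolated corrections (Equation 5) can be sketched for a single update block as follows. The function and parameter names are hypothetical, and the per-frame corrections are assumed to be supplied already, for example from a linear or weighted spreading of the phase difference.

```python
import math

def wrap_phase(phi):
    """Map a phase in radians onto the interval [-pi, pi)."""
    return (phi + math.pi) % (2.0 * math.pi) - math.pi

def phases_with_update(prev_phase, freqs, update_interval, corrections):
    """Equation 5 over one update block.

    prev_phase   continuous phase of the frame just before the block (rad)
    freqs        frequencies of that frame followed by the n block frames (rad/s)
    corrections  n per-frame phase corrections (rad)
    """
    phases = []
    phase = prev_phase
    for k in range(1, len(freqs)):
        step = 0.5 * update_interval * (freqs[k] + freqs[k - 1])
        phase = wrap_phase(phase + step + corrections[k - 1])
        phases.append(phase)
    return phases
```

Because the corrections sum to the full phase difference, the last frame of the block lands on the updated phase without any single frame having to absorb the whole jump.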

At the same time, as the sinusoidal components of the signal are being 
synthesized, the noise code CN is fed to a noise synthesizer NS 33, which is mainly a filter, 
having a frequency response approximating the spectrum of the noise. The NS 33 generates 
reconstructed noise yN by filtering a white noise signal with the noise code CN. 

The total signal y(t) comprises the sum of the transient signal yT and the
product of any amplitude decompression (g) and the sum of the sinusoidal signal yS and the
noise signal yN. The audio player comprises two adders 36 and 37 to sum respective signals.
The total signal is furnished to an output unit 35, which is e.g. a speaker.
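The combination of the three synthesized components can be written as a short sketch. Representing the signals as equal-length lists of samples and the decompression gain g as one value per sample are assumptions made for illustration.

```python
def mix_output(y_transient, y_sinusoid, y_noise, gain):
    """y = yT + g * (yS + yN), computed sample by sample."""
    return [t + g * (s + n)
            for t, s, n, g in zip(y_transient, y_sinusoid, y_noise, gain)]
```

If no gain compression was applied in the coder, `gain` is simply a list of ones and the output is the plain sum of the three components.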

In the preferred embodiments above, the phase update is described as applying
to the n frames received prior to the update. It will be seen, however, that the invention is
equally applicable to including the phase update information at the beginning of the n frames
to which the update applies. In this manner, the phase can be determined with an equation
similar to Equation 5 as the information for the frame is received.

Further variations are also possible including, for example, transmitting an
indicator as to whether absolute phase values or phase differences are to be transmitted as
phase update information. In a similar fashion the use of adaptive updating (varying n) could
be signaled in the bitstream. Also, it may be desirable to indicate in the bitstream that for
certain frequency ranges, no phase update information will be supplied, as it may be found
that using phase update information only benefits sound quality for particular frequency
ranges.

Fig. 3 shows an audio system according to the invention comprising an audio
coder 1 as shown in Fig. 1 and an audio player 3 as shown in Fig. 2. Such a system offers
playing and recording features. The audio stream AS is furnished from the audio coder to the
audio player over a communication channel 2, which may be a wireless connection, a data
bus or a storage medium. In case the communication channel 2 is a storage medium, the
storage medium may be fixed in the system or may also be a removable disc, memory stick
etc. The communication channel 2 may be part of the audio system, but will however often
be outside the audio system.

The present invention can be used in any sinusoidal audio coder where
continuous phases are used. As such, the invention is applicable anywhere such coders are
employed.

It should be noted that the above-mentioned embodiments illustrate rather than
limit the invention, and that those skilled in the art will be able to design many alternative
embodiments without departing from the scope of the appended claims. In the claims, any
reference signs placed between parentheses shall not be construed as limiting the claim. The
word 'comprising' does not exclude the presence of other elements or steps than those listed
in a claim. The invention can be implemented by means of hardware comprising several
distinct elements, and by means of a suitably programmed computer. In a device claim
enumerating several means, several of these means can be embodied by one and the same
item of hardware. The mere fact that certain measures are recited in mutually different
dependent claims does not indicate that a combination of these measures cannot be used to
advantage.



